 ground truth box





Target Detection of Safety Protective Gear Using the Improved YOLOv5

Liu, Hao, Qin, Xue

arXiv.org Artificial Intelligence

In high-risk railway construction, personal protective equipment monitoring is critical but challenging due to small and frequently obstructed targets. We propose YOLO-EA, an innovative model that enhances safety measure detection by integrating ECA into its backbone's convolutional layers, improving discernment of minuscule objects like hardhats. YOLO-EA further refines target recognition under occlusion by replacing GIoU with EIoU loss. YOLO-EA's effectiveness was empirically substantiated using a dataset derived from real-world railway construction site surveillance footage. It outperforms YOLOv5, achieving 98.9% precision and 94.7% recall, up 2.5% and 0.5% respectively, while maintaining real-time performance at 70.774 fps. This highly efficient and precise YOLO-EA holds great promise for practical application in intricate construction scenarios, enforcing stringent safety compliance during complex railway construction projects.
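To make the loss change concrete, here is a minimal sketch of an EIoU-style loss for two axis-aligned boxes in `(x1, y1, x2, y2)` form. It follows the commonly cited EIoU definition (an IoU term plus penalties on center distance, width difference, and height difference, each normalized by the smallest enclosing box). The abstract does not give the paper's exact formulation, so the function name `eiou_loss`, the box format, and the `eps` stabilizer are assumptions for illustration only.

```python
def eiou_loss(pred, gt, eps=1e-9):
    """EIoU-style loss for two boxes given as (x1, y1, x2, y2).

    Illustrative sketch only; not taken from the YOLO-EA implementation.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Plain IoU term.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = (px2 - px1) * (py2 - py1) + (gx2 - gx1) * (gy2 - gy1) - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Squared center distance, normalized by the enclosing diagonal squared.
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2
    center_term = (dx * dx + dy * dy) / (cw * cw + ch * ch + eps)

    # Squared width and height differences, normalized per axis.
    dw = (px2 - px1) - (gx2 - gx1)
    dh = (py2 - py1) - (gy2 - gy1)
    width_term = dw * dw / (cw * cw + eps)
    height_term = dh * dh / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

Unlike a pure IoU loss, the value keeps changing for non-overlapping boxes, because the center, width, and height penalties still depend on how far apart and how differently shaped the two boxes are.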


Scenario Diffusion: Controllable Driving Scenario Generation With Diffusion

Pronovost, Ethan, Ganesina, Meghana Reddy, Hendy, Noureldin, Wang, Zeyu, Morales, Andres, Wang, Kai, Roy, Nicholas

arXiv.org Artificial Intelligence

Automated creation of synthetic traffic scenarios is a key part of validating the safety of autonomous vehicles (AVs). In this paper, we propose Scenario Diffusion, a novel diffusion-based architecture for generating traffic scenarios that enables controllable scenario generation. We combine latent diffusion, object detection and trajectory regression to generate distributions of synthetic agent poses, orientations and trajectories simultaneously. To provide additional control over the generated scenario, this distribution is conditioned on a map and sets of tokens describing the desired scenario. We show that our approach has sufficient expressive capacity to model diverse traffic patterns and generalizes to different geographical regions.


Unbalanced Optimal Transport: A Unified Framework for Object Detection

De Plaen, Henri, De Plaen, Pierre-François, Suykens, Johan A. K., Proesmans, Marc, Tuytelaars, Tinne, Van Gool, Luc

arXiv.org Artificial Intelligence

During training, supervised object detection tries to correctly match the predicted bounding boxes and associated classification scores to the ground truth. This is essential to determine which predictions are to be pushed towards which solutions, or to be discarded. Popular matching strategies include matching to the closest ground truth box (mostly used in combination with anchors), or matching via the Hungarian algorithm (mostly used in anchor-free methods). Each of these strategies comes with its own properties, underlying losses, and heuristics. We show how Unbalanced Optimal Transport unifies these different approaches and opens a whole continuum of methods in between. This allows for a finer selection of the desired properties. Experimentally, we show that training an object detection model with Unbalanced Optimal Transport is able to reach the state-of-the-art both in terms of Average Precision and Average Recall as well as to provide a faster initial convergence. The approach is well suited for GPU implementation, which proves to be an advantage for large-scale models.
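The two matching strategies named in the abstract can be contrasted with a small sketch. It is not the paper's Unbalanced Optimal Transport method: it only shows closest-ground-truth matching next to a one-to-one minimum-cost matching. The one-to-one version is brute-forced over permutations for readability, which solves the same assignment problem as the Hungarian algorithm but only scales to a handful of boxes. The function names, the `1 - IoU` cost, and the `thr` threshold are assumptions made for illustration.

```python
from itertools import permutations


def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_closest(preds, gts, thr=0.5):
    """Each prediction gets the ground truth with the highest IoU, or None below thr.

    Several predictions may share one ground truth.
    """
    if not gts:
        return [None] * len(preds)
    out = []
    for p in preds:
        scores = [iou(p, g) for g in gts]
        best = max(range(len(gts)), key=scores.__getitem__)
        out.append(best if scores[best] >= thr else None)
    return out


def match_one_to_one(preds, gts):
    """Each ground truth gets exactly one distinct prediction, minimizing total 1 - IoU.

    Brute force over permutations; assumes len(preds) >= len(gts).
    """
    best_cost, best_perm = float("inf"), ()
    for perm in permutations(range(len(preds)), len(gts)):
        cost = sum(1.0 - iou(preds[p], gts[g]) for g, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    out = [None] * len(preds)
    for g, p in enumerate(best_perm):
        out[p] = g
    return out


gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
closest = match_closest(preds, gts)        # two predictions share ground truth 0
one_to_one = match_one_to_one(preds, gts)  # the near-duplicate prediction is left unmatched
```

With closest matching, both overlapping predictions are pushed towards the same ground truth box; with one-to-one matching, only the better one is, and the other is treated as background.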


Generating Driving Scenes with Diffusion

Pronovost, Ethan, Wang, Kai, Roy, Nick

arXiv.org Artificial Intelligence

In this paper we describe a learned method of traffic scene generation designed to simulate the output of the perception system of a self-driving car. In our "Scene Diffusion" system, inspired by latent diffusion, we use a novel combination of diffusion and object detection to directly create realistic and physically plausible arrangements of discrete bounding boxes for agents. We show that our scene generation model is able to adapt to different regions in the US, producing scenarios that capture the intricacies of each region.


Confident Object Detection via Conformal Prediction and Conformal Risk Control: an Application to Railway Signaling

Andéol, Léo, Fel, Thomas, De Grancey, Florence, Mossina, Luca

arXiv.org Artificial Intelligence

Deploying deep learning models in real-world certified systems requires the ability to provide confidence estimates that accurately reflect their uncertainty. In this paper, we demonstrate the use of the conformal prediction framework to construct reliable and trustworthy predictors for detecting railway signals. Our approach is based on a novel dataset that includes images taken from the perspective of a train operator and state-of-the-art object detectors. We test several conformal approaches and introduce a new method based on conformal risk control. Our findings demonstrate the potential of the conformal prediction framework to evaluate model performance and provide practical guidance for achieving formally guaranteed uncertainty bounds.
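As a rough illustration of the kind of guarantee involved, the sketch below applies generic split conformal prediction to bounding boxes: a margin is calibrated on held-out data so that padded predicted boxes contain the true box with probability at least 1 - alpha, under the usual exchangeability assumption. This is not the paper's method or its conformal risk control variant. The score definition and the names `margin_score`, `conformal_quantile`, and `pad_box` are assumptions chosen for the example.

```python
import math


def margin_score(pred, gt):
    """How far the true box sticks out of the predicted box, in pixels.

    Boxes are (x1, y1, x2, y2). A value <= 0 means the prediction already covers the truth.
    """
    return max(pred[0] - gt[0], pred[1] - gt[1], gt[2] - pred[2], gt[3] - pred[3])


def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to give a finite margin at this level.
        return float("inf")
    return sorted(scores)[k - 1]


def pad_box(box, q):
    """Grow a predicted box by q on every side."""
    return (box[0] - q, box[1] - q, box[2] + q, box[3] + q)
```

In use, `margin_score` would be computed for every matched prediction in a calibration set, `conformal_quantile` would turn those scores into a single margin, and `pad_box` would apply that margin to new predictions.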


CornerNet: Detecting Objects as Paired Keypoints

#artificialintelligence

CornerNet is an object detection technique that detects an object's bounding box as a pair of keypoints, the top-left corner and the bottom-right corner, using a single convolutional neural network. By detecting keypoints, it eliminates the need for the anchor boxes commonly used in single-stage detectors. In this paper, Hei Law and Jia Deng from Princeton University introduce a new approach to object detection that outperforms all existing single-stage detectors. CornerNet introduces a new type of pooling layer, called corner pooling, that helps localize the corners. The network achieves 42.2% AP on the MS COCO dataset.
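The idea behind corner pooling for a top-left corner can be sketched in a few lines: at each location, take the maximum of one feature map looking downward along the column and the maximum of a second feature map looking rightward along the row, then add the two. The sketch below uses plain nested lists for single-channel maps; the function name and the argument names `down_feat` and `right_feat` are hypothetical, and the real layer operates on multi-channel tensors inside the network.

```python
def top_left_corner_pool(down_feat, right_feat):
    """Top-left corner pooling on two H x W maps given as lists of lists.

    down_feat is max-pooled from each cell to the bottom of its column,
    right_feat is max-pooled from each cell to the right end of its row,
    and the two results are summed element-wise.
    """
    h, w = len(down_feat), len(down_feat[0])

    # Running maximum scanning upward, so each cell sees everything below it.
    down = [row[:] for row in down_feat]
    for i in range(h - 2, -1, -1):
        for j in range(w):
            down[i][j] = max(down[i][j], down[i + 1][j])

    # Running maximum scanning leftward, so each cell sees everything to its right.
    right = [row[:] for row in right_feat]
    for i in range(h):
        for j in range(w - 2, -1, -1):
            right[i][j] = max(right[i][j], right[i][j + 1])

    return [[down[i][j] + right[i][j] for j in range(w)] for i in range(h)]


pooled = top_left_corner_pool([[0, 0], [0, 5]], [[0, 3], [0, 0]])
```

A top-left corner usually has no local evidence of the object at its own position, so looking along the row and the column lets the response depend on the object's edges that lie further right and further down.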


Everything You Need to Know About Object Detection Systems

#artificialintelligence

With the advent of deep learning, implementing an object detection system has become fairly trivial. There are a great many frameworks facilitating the process, and as I showed in a previous post, it's quite easy to create a fast object detection model with YOLOv5. However, understanding the basics of object detection is still quite difficult. It involves a lot of math, and the variable number of outputs/bounding boxes makes it harder to understand than image classification, where we know the number of outputs beforehand. With so many moving parts and new concepts introduced over the history of object detection, it certainly hasn't gotten easier. In this post, I'll distill all this history into a simple guide that explains all the details of object detection and instance segmentation systems. The classic image classification problem is very well known: given an image, can you find the class the image belongs to? We can solve any new image classification problem with ConvNets and transfer learning, using pre-trained nets in which the ConvNets act as fixed feature extractors.